ABSTRACT

The concept of a frontier production relation has been attacked on the
grounds that mismeasurement of output makes it impossible to separate
efficient from inefficient firms, i.e., what looks like inefficiency may
actually be mismeasurement of output. In this paper, we illustrate one
method for estimating a frontier production relation when output is
poorly measured, leading to errors-in-the-variables. The technique,
based on Goldberger's factor analysis model, is meant to avoid
not only spurious findings of inefficiency but also an overestimate of
scale economies. Our empirical example involves a military application:
U.S. naval bases. In this example, taking account of the
errors-in-variables problem does not decrease the indicator of average
inefficiency. It does, however, substantially reduce the measured
economies of scale.
I. INTRODUCTION

A controversial feature of frontier production models is that some
firms are labeled "inefficient." The controversy centers on the source,
nature, and even the existence of the inefficiency. Leibenstein (1966)
argues that some firms operate below their own frontier and uses the
phrase X-efficiency to mean less-than-efficient operation. Aigner,
Lovell, and Schmidt (1977) write of technical efficiency in much the
same way. Stigler (1976) and Maddala and Fishe (1979) argue that there
is no such thing as inefficiency; what looks like inefficiency is
actually the result of unmeasured differences in the capacity of
different entrepreneurs, the taste for leisure, etc.

In econometric terms, the Stigler argument is that what is usually
called inefficiency is actually the result of errors-in-variables.
Inputs and outputs are measured poorly. Errors-in-variables could
create two problems: (1) a biased estimate of the degree of
inefficiency and (2) a spurious finding of scale economies. To see (2),
suppose that production technology is summarized with a cost curve.
Errors in measuring output will bias downward the measured effect of
output on cost and, hence, bias upward the measured economies of scale.
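
The direction of this bias can be checked with a small simulation. The sketch below is illustrative only: the sample size, the noise levels, and the true cost elasticity of 1.0 (constant returns) are hypothetical values chosen for the example, not figures from the Navy data. It regresses log cost on log output twice, once with output measured exactly and once with measurement error added.

```python
import random

def ols_slope(x, y):
    """Slope from a one-regressor OLS fit of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

rng = random.Random(0)      # fixed seed so the run is repeatable
n = 5000                    # hypothetical number of cross-sectional units
true_elasticity = 1.0       # hypothetical: constant returns to scale

# True log output, log cost generated from it, and a mismeasured copy of output.
log_output = [rng.gauss(0.0, 1.0) for _ in range(n)]
log_cost = [2.0 + true_elasticity * y + rng.gauss(0.0, 0.3) for y in log_output]
noisy_output = [y + rng.gauss(0.0, 0.7) for y in log_output]

slope_true = ols_slope(log_output, log_cost)     # uses correctly measured output
slope_noisy = ols_slope(noisy_output, log_cost)  # uses mismeasured output

print(f"slope with true output:        {slope_true:.3f}")
print(f"slope with mismeasured output: {slope_noisy:.3f}")
```

Under these assumed variances the slope on mismeasured output should fall well below the true elasticity, so a technology with constant returns would appear to show economies of scale.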

In this paper, we apply Goldberger's (1974) technique for reducing
errors-in-variables and examine the effect on the measured degree of
efficiency and the measured economies of scale. The data are drawn from
a military application: costs and "outputs" for a large cross-section
of U.S. Navy bases. Both efficiency and economies of scale are of major
interest to the Navy. Measured efficiency of different bases is an aid
in cost control; economies of scale are one (of many) considerations in
base consolidation.

II. A FRONTIER COST CURVE AND ERRORS-IN-VARIABLES

Consider a log-linear multiple-output cost equation of the form

$$c = a + YB + e \qquad (3)$$

where c is an N x 1 vector of the natural log of costs at different
cross-sectional units, Y is an N x L matrix of outputs (each measured
in logarithmic terms), and e is an N x 1 vector of error terms.
Variation in input prices was suppressed because all measures are at the
same point in time. The error term e is composed of two parts: (1)
the type of random noise usually termed statistical error ($v_i$) and (2)
inefficiency ($u_i$). The inefficiency error includes technical
inefficiency (deviation from the production function) and allocative
inefficiency (inappropriate response to factor prices).

In addition to the two components of e, there are
errors-in-the-variables. What the researchers have on hand is not really Y, but
rather X = Y + W, where W is an N x L matrix of disturbances.
There is standard advice on how to handle errors-in-variables: replace
the OLS estimator of B,

$$\hat{B} = (X'X)^{-1} X'c,$$

with

$$\tilde{B} = [X'X - E(WW')]^{-1} X'c.$$

For example, see Theil (1971, p. 610). An obvious alternative is to
use

$$B^* = (\hat{Y}'\hat{Y})^{-1} \hat{Y}'c,$$

where $\hat{Y} = X - \hat{W}$. Both alternatives have remained useless: there has
been no way to estimate E(WW') or W.
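
To make the corrected estimator concrete, the following sketch applies it to simulated data in which the error variances are known by construction. That knowledge is exactly what is unavailable in practice, so this is an illustration of the formula under stated assumptions, not a usable procedure. All numerical values are hypothetical, the intercept is omitted, and the variables are generated with mean zero so that the error moment matrix (written E(WW') in the text) is N times the L x L error covariance.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed for repeatability
N, L = 4000, 2                           # hypothetical sample size and outputs
B_true = np.array([0.6, 0.3])            # hypothetical cost elasticities
sd_w = np.array([0.5, 0.8])              # hypothetical measurement-error s.d.

Y = rng.normal(size=(N, L))              # true log outputs (mean zero)
W = rng.normal(scale=sd_w, size=(N, L))  # measurement errors, independent of Y
X = Y + W                                # what the researcher observes
c = Y @ B_true + rng.normal(scale=0.2, size=N)   # log cost, no intercept

# Error moment matrix, assumed known here purely for illustration.
error_moments = N * np.diag(sd_w ** 2)

# Ordinary least squares on the mismeasured outputs.
B_ols = np.linalg.solve(X.T @ X, X.T @ c)

# Corrected estimator: subtract the error moments before inverting.
B_corrected = np.linalg.solve(X.T @ X - error_moments, X.T @ c)

print("OLS:      ", np.round(B_ols, 3), "sum =", round(float(B_ols.sum()), 3))
print("corrected:", np.round(B_corrected, 3),
      "sum =", round(float(B_corrected.sum()), 3))
```

In this setup the OLS coefficients should be pulled toward zero and the corrected coefficients should lie close to the assumed true values, so the sum of coefficients (the indicator of scale) is understated by OLS.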


Factor Analysis

Under specified assumptions, factor analysis can be used to make
the necessary estimate $\hat{W}$. This is the essence of Goldberger's
(1974) multiple indicators model.*

* It has long been recognized that the errors-in-variables problem can
be reduced by averaging. One interpretation of the multiple indicators
model is as a means of putting different explanatory variables in
comparable units so that they can be averaged.

In the context of our problem, Goldberger's model can be written as
equations (4) to (7):


$$c = a + YB + e \qquad (4)$$

$$X = Y + W \qquad (5)$$

$$Y = FA \qquad (6)$$

so

$$X = FA + W \qquad (7)$$

$$E(FW') = 0, \qquad E(WW') = \Psi \ \text{(diagonal)}$$

where X is an N x K matrix of indicators, F is an N x L matrix of
latent factors, A is an L x K matrix of factor loadings, and W is an
N x K matrix of error terms.

In our application, we will assume that the many dimensions of
output can be represented by three latent variables, distilled from 15
potential indicators. We investigate two strategies for estimating the
three latent variables: (1) to choose three of the indicators and enter
them as explanatory variables in an OLS regression and (2) to extract
three factors from the 15 indicators and enter them in a regression as
explanatory variables. Strategy (1) makes no correction for
errors-in-variables; strategy (2) does make the correction.


Using the factors themselves involves a well-known indeterminacy,
usually associated with factor "rotation." As Goldberger (1978) notes,
the product FA can also be written $[FH][H^{-1}A]$. Thus, if F is a
matrix of factors, so is FH, where H is any L x L nonsingular
matrix. Since F is unobservable, it has no natural units. The choice
of H can be viewed as putting F into convenient units. We chose to
put F, which involves three variables, into the units of the three
indicators used in the OLS version, since this makes the comparison more
intuitive. It turns out, however, that the $R^2$ and individual errors
would be unchanged if the factors were multiplied by any nonsingular
matrix.
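
The invariance stated at the end of this paragraph can be checked numerically. The sketch below is a minimal illustration on simulated data: the factor matrix, the coefficients, and the particular nonsingular matrix H are all hypothetical. It fits the same regression on F and on FH and compares the residuals, the fit, and the coefficients.

```python
import numpy as np

def fit(regressors, c):
    """OLS of c on an intercept plus the given regressors."""
    Z = np.column_stack([np.ones(len(c)), regressors])
    coef, *_ = np.linalg.lstsq(Z, c, rcond=None)
    resid = c - Z @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((c - c.mean()) ** 2)
    return coef, resid, r2

rng = np.random.default_rng(1)            # fixed seed for repeatability
N, L = 144, 3                             # 144 units, three factors
F = rng.normal(size=(N, L))               # hypothetical factor matrix
c = 1.0 + F @ np.array([0.5, 0.3, 0.1]) + rng.normal(scale=0.4, size=N)

# Any nonsingular L x L matrix will do; this one has determinant 7.
H = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])

coef_F, resid_F, r2_F = fit(F, c)
coef_FH, resid_FH, r2_FH = fit(F @ H, c)

print("R-squared on F :", round(float(r2_F), 6))
print("R-squared on FH:", round(float(r2_FH), 6))
print("max residual difference:", float(np.max(np.abs(resid_F - resid_FH))))
```

The residuals and the fit agree because F and FH span the same column space; only the individual slope coefficients change, from B to H^{-1}B.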

III. THE APPLICATION TO NAVAL BASES

The data base involves expenditure for base operation at 144 naval
bases in 1979. The relationship to be estimated is log-linear:

$$c = a + YB + e$$

where

$c_i$ is the natural log of expenditure at base i for operating
support (building maintenance, medical services, and other costs of
base operation, as opposed to, say, direct pay to pilots or fuel
for naval planes)

Y is a matrix of unobservable latent "outputs" in logarithmic form.

We start with 15 indicators for outputs:

1. Energy consumption of the base in 1979 measured in BTUs;
   includes electricity, coal, natural gas, and fuel oil
2. Energy consumption in 1978
3. Building area in square feet in 1979
4. Land area of the base in acres
5. Civilian personnel in 1979 (number of persons)
6. Civilian personnel (1978)
7. Active military personnel in 1979 (number of persons)
8. Active military personnel (1978)
9. Dependents of military personnel: dependents located off
   the base (number of persons)
10. Dependents on base (1979)
11. Current value of physical plant (dollars)
12. Retired personnel associated with the base (e.g., who use
    the base hospital), 1979 (number of persons)
13. Active personnel in 1979 (FDR)
14. Civilian personnel in 1979 (FDR)
15. Reserve personnel who use base (FDR).

Except those noted, all of the data for this study come from the
Domestic Base Factors Report (DBFR). This is a survey of a sample of
naval bases to obtain information on base support costs, acreage,
building area, and a variety of other base characteristics. The survey has
been given since 1977. The one used primarily in our study is the 1979
survey.

The sample for the survey is chosen to include all the largest naval
installations in the U.S. Our sample of bases included 144 of the bases
in the DBFR. To arrive at our sample of bases, some bases from the DBFR
sample were eliminated because of missing data or because they seemed
one-of-a-kind.

The other source of data, the FDR (Force Distribution Report), is a
listing of billets or slots assigned to particular bases.

Of the 15 indicators, the three chosen for the OLS run were

$X_1$ is the natural log of 1979 energy consumption in BTUs
(taken as a measure of activity at the base)

$X_2$ is the natural log of building area in square feet

$X_3$ is the natural log of civilian employment.

Choosing only these three ignores the information in the other 12
indicators, but avoids the massive collinearity that would result from
using all 15 indicators.

The factor analytic regression replaced $X_1$, $X_2$, $X_3$ by
$\hat{X}_1$, $\hat{X}_2$, $\hat{X}_3$, the appropriate columns from the
estimated FA. As noted earlier, these are just the factors, put into
appropriate units.
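
The construction of these fitted columns can be sketched as follows. The example is a rough illustration on simulated data and makes two assumptions that should be read as such: it uses a principal-components (truncated SVD) extraction as a simple stand-in for the factor-analytic estimation, and the positions of the three chosen indicators among the 15 columns are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)             # fixed seed for repeatability
N, K, L = 144, 15, 3                       # bases, indicators, latent outputs

# Simulated indicators: three latent outputs plus indicator-specific noise.
F_true = rng.normal(size=(N, L))
A_true = rng.uniform(0.5, 1.5, size=(L, K))
X = F_true @ A_true + rng.normal(scale=0.5, size=(N, K))

# Stand-in extraction: best rank-L approximation of the centered indicators.
means = X.mean(axis=0)
U, s, Vt = np.linalg.svd(X - means, full_matrices=False)
F_hat = U[:, :L] * s[:L]                   # N x L estimated factors
A_hat = Vt[:L, :]                          # L x K estimated loadings
X_hat = means + F_hat @ A_hat              # fitted indicators (estimated FA)

# Hypothetical positions of the three indicators used in the OLS run.
cols = [0, 2, 4]
X_hat_3 = X_hat[:, cols]                   # regressors for the factor version

print("shape of fitted regressors:", X_hat_3.shape)
print("rank of estimated FA:", np.linalg.matrix_rank(F_hat @ A_hat))
```

Each fitted column is a linear combination of the same three factors, which is the sense in which the fitted indicators are the factors expressed in the units of the chosen indicators.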

Results are shown in table 1.

TABLE 1

A. A Comparison of Results

Table 1 compares the results of the two methods. The factor analytic
version has a higher indicator of scale ($\sum B$, the sum of the
coefficients), .895 versus .753, i.e., the deviation from constant returns
is reduced from about 25 percent to about 10 percent. This is consistent
with the hypothesis that OLS results are biased by errors-in-variables.
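
The two percentages can be recovered from the reported sums, assuming the deviation from constant returns is measured as one minus the sum of the output coefficients (a sum of one corresponding to constant returns in a log-linear cost equation):

```latex
\begin{align*}
\text{OLS on three indicators:} \quad & 1 - 0.753 = 0.247 \approx 25\% \\
\text{OLS on three factors:}    \quad & 1 - 0.895 = 0.105 \approx 10\%
\end{align*}
```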

The regressions are similar in terms of $R^2$, the small difference
running in favor of the OLS version. The t-values are somewhat lower in
the factor analytic version and two of the three coefficients have an
unexpected sign. These are both consequences of increased
multicollinearity among the right-hand-side variables. The attempt to purge
measurement error has eliminated some of the independent variation among
the Xs, which is in accord with the theory of errors in variables (in
particular, the diagonality of $\Psi$). What the increased collinearity
suggests is that factor analysis is more useful in obtaining the sum of
coefficients than individual coefficients. Note also that $\hat{X}_1$,
$\hat{X}_2$, $\hat{X}_3$ behave very little like $X_1$, $X_2$, $X_3$. This is
not surprising since the $\hat{X}$ variables really include information from
all the indicators.

We turn now to the question of whether the technique has changed
measures of mean inefficiency for the sample as a whole. Two such
measures are discussed in Aigner, Lovell, and Schmidt (1977) and Lee and
Tyler (1978). An aggregate measure of inefficiency for the sample
observations is the fraction of the total unexplained variation that is
attributed to variations in efficiency:

$$\hat{\lambda}_1 = \hat{\sigma}_u^2 / (\hat{\sigma}_u^2 + \hat{\sigma}_v^2)$$


A second indicator of the importance of inefficiency is the estimated
expected value of u:

$$\hat{\lambda}_2 = \hat{E}(u) = \sqrt{2/\pi}\,\hat{\sigma}_u$$

Since the regression is logarithmic, $100\hat{\lambda}_2$ is approximately equal to
the percentage effect of inefficiency.
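
As a worked illustration of the two indicators, the sketch below computes them from a pair of standard deviations. The input values are hypothetical, not estimates from this study, and the second measure assumes a half-normal inefficiency term, for which the expected value is the standard deviation parameter times the square root of 2/pi. The function name is ours.

```python
import math

def inefficiency_measures(sigma_u, sigma_v):
    """Return (share of error variance due to inefficiency, expected inefficiency).

    sigma_u: scale of the one-sided inefficiency term (half-normal assumed)
    sigma_v: standard deviation of the symmetric statistical noise
    """
    share = sigma_u ** 2 / (sigma_u ** 2 + sigma_v ** 2)
    expected_u = math.sqrt(2.0 / math.pi) * sigma_u
    return share, expected_u

# Hypothetical values chosen only to show the arithmetic.
lam1, lam2 = inefficiency_measures(sigma_u=0.3, sigma_v=0.4)

# In a log regression, 100 * lam2 approximates the percentage cost increase;
# the exact figure is 100 * (exp(lam2) - 1).
approx_pct = 100.0 * lam2
exact_pct = 100.0 * math.expm1(lam2)

print(f"share of variance due to inefficiency: {lam1:.3f}")
print(f"expected inefficiency (log points):    {lam2:.3f}")
print(f"percentage effect, approx vs exact:    {approx_pct:.1f} vs {exact_pct:.1f}")
```

The gap between the approximate and exact percentages grows with the size of the expected inefficiency, which is why the text describes the relation as approximate.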


The estimates of aggregate inefficiency corresponding to the two
regressions are listed in table 3.


TABLE 3

MEASURES OF THE IMPORTANCE OF AGGREGATE INEFFICIENCY

Formula        OLS on three indicators        OLS on three factors


As can be seen from the table, the primary effect of the correction
for measurement error is to reduce the portion of the error attributable
to random noise and to increase, both relatively ($\lambda_1$) and
absolutely ($\lambda_2$), the portion attributable to inefficiency. Thus,
attempting to correct for measurement error does not necessarily reduce
measured inefficiency.

CONCLUSION

We have presented an illustration of how frontier relations change
when an attempt is made to reduce measurement error via factor
analysis. The results were: (1) the sum of the coefficients (the indicator
of scale) moved in the direction expected if measurement error were
reduced; (2) the purely statistical properties of the estimates did not
improve; and (3) the adjustment for measurement error did not reduce the
total error variance; it decreased the amount attributed to randomness
and increased the amount attributed to inefficiency. To summarize, we
did find evidence of errors-in-variables and that errors-in-variables
led to an overstatement of scale economies, but we did not find that
errors-in-variables led to an overestimate of inefficiency.